---
title: "Local participation and not unemployment explains the M5S result in the South"
author: "Francesco Bailo"
date: "3/19/2018"
output: 
  html_document: 
    self_contained: no
    fig_caption: yes
---

<style>
div.figure {
  width: 100%;
  text-align: center;
  font-style: italic;
  font-size: smaller;
  text-indent: 0;
  border: thin silver solid;
  margin: 0.5em;
  padding: 0.5em;
}

table {
    margin-left:auto; 
    margin-right:auto;
    text-align: left;
    border-collapse: collapse;}
caption {
    font-weight: bold;
    text-align: left;}
table th {
    font-size: 95%;
    font-weight: bold;
    padding: 8px;
    border-bottom: 1px solid #000000;}
table td {
    font-size: 80%;
    padding: 4px;
    border-bottom: 1px solid #fff;
    border-top: 1px solid transparent;}
tbody tr:hover td {
    background: #d0dafd;
    color: #339;}
table tr:nth-child(even) {
    background-color: #e8edff;}
tfoot {
    font-size: 78%;}

</style>

The abundance of economic data and the scarcity of social data with a comparable level of granularity are a problem for the quantitative analysis of social phenomena. I argue that this fundamental problem has misled the analysis of the electoral results of the Five Star Movement (M5S) and its interpretation. In this article, I provide statistical evidence suggesting that --- in the South --- unemployment is not associated with the exceptional increase in M5S support and that local participation is a stronger predictor of support than most demographic variables. 

## What happened

```{r include=FALSE}
library(knitr)
opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE, cache = TRUE, fig.retina = 3)
```


```{r setup, include=FALSE, cache = F}

mapeval = TRUE

require(rgdal)

require(RColorBrewer)

require(stargazer)

library(DBI)
library(RPostgreSQL)
library(rpostgis)
options(scipen=999)

conn <- dbConnect(
  drv = PostgreSQL(),
  dbname = "istat_sez2011",
  host = "localhost",
  port = "5432",
  user = "francesco",
  password = "")

require(reshape2)

comuni_sp <- readOGR('/Users/francesco/public_git/scrape_2018_ita_ge_results/output_shapefiles/', layer = 'istat_comuni_2018')

comuni2018_camera2013 <- dbReadTable(conn, c("ita_ge18", "comuni2018_camera2013"))

# Dissolve
comuni_to_merge <- 
  as.character(comuni_sp$PRO_COM_T[!comuni_sp$PRO_COM_T %in% comuni2018_camera2013$PRO_COM_T_2018])
comuni_sp$PRO_COM_T <- as.character(comuni_sp$PRO_COM_T)
comuni_sp$PRO_COM_T[comuni_sp$PRO_COM_T %in% comuni_to_merge] <- 'dissolved1'
  
# require(rmapshaper)
# comuni_sp <- ms_dissolve(comuni_sp, field = 'PRO_COM_T')

comuni2018_candidate_votes <- 
  dbReadTable(conn, c("ita_ge18", "comuni2018_camera_candidate_votes"))
comuni2018_party_votes <- 
  dbReadTable(conn, c("ita_ge18", "comuni2018_camera_party_votes"))

comuni_sp <- merge(comuni_sp, comuni2018_camera2013[,c('PRO_COM_T_2018','PDL','LN','M5S','PD','ppdt_nRegisteredVoters', 'ppdt_nAssignedVotes')], by.x='PRO_COM_T', by.y='PRO_COM_T_2018')
names(comuni_sp)[13:18] <- paste0(names(comuni_sp)[13:18], "_2013")

comuni2018_candidate_votes <- subset(comuni2018_candidate_votes, coalizione == 'M5S')
comuni2018_candidate_votes <- comuni2018_candidate_votes[,c('PRO_COM_T','votes')]
names(comuni2018_candidate_votes) <- c("PRO_COM_T","M5S")

# test <- comuni2018_party_votes[,c("PRO_COM_T","partito")]
# comuni2018_party_votes$dup <- duplicated(test) | duplicated(test, fromLast = T)

comuni2018_party_votes <- 
  dcast(data = subset(comuni2018_party_votes, partito %in% c("PD","FI","Lega")), 
        formula = PRO_COM_T ~ partito, value.var = 'votes')

comuni2018_place_stats <- 
  dbReadTable(conn, c("ita_ge18", "comuni2018_camera_place_stats"))
names(comuni2018_place_stats)[names(comuni2018_place_stats) == 'ppdt_nEligibleVoters'] <-
  "ppdt_nRegisteredVoters"

comuni2018_party_votes <- merge(comuni2018_party_votes, comuni2018_place_stats, by = 'PRO_COM_T')

comuni2018_votes <- merge(comuni2018_party_votes, comuni2018_candidate_votes, by = 'PRO_COM_T')
names(comuni2018_votes)[2:9] <- paste0(names(comuni2018_votes)[2:9], "_2018")
comuni_sp <- merge(comuni_sp, comuni2018_votes, by = 'PRO_COM_T')

comuni_sp$M5S_diff <- 
  with(as.data.frame(comuni_sp), 
       ((M5S_2018 / ppdt_nRegisteredVoters_2018) - (M5S_2013 / ppdt_nRegisteredVoters_2013)) * 100)
comuni_sp$PD_diff <- 
  with(as.data.frame(comuni_sp), 
       ((PD_2018 / ppdt_nRegisteredVoters_2018) - (PD_2013 / ppdt_nRegisteredVoters_2013)) * 100)
comuni_sp$Lega_diff <- 
  with(as.data.frame(comuni_sp), 
       ((Lega_2018 / ppdt_nRegisteredVoters_2018) - (LN_2013 / ppdt_nRegisteredVoters_2013)) * 100)
comuni_sp$FI_diff <- 
  with(as.data.frame(comuni_sp), 
       ((FI_2018 / ppdt_nRegisteredVoters_2018) - (PDL_2013 / ppdt_nRegisteredVoters_2013)) * 100)

breaks <- c(-100, -50, -20, -10, -5, -2, 2, 5, 10, 20, 50)

for (var  in c("M5S_diff","PD_diff", "Lega_diff", "FI_diff")) {
  comuni_sp[[var]][comuni_sp$sezioni_scrutinate_2018/comuni_sp$sezioni_totali_2018<1] <- NA
  comuni_sp[[var]][is.infinite(comuni_sp[[var]])] <- NA
  comuni_sp[[paste0(var, "_brk")]] <- cut(comuni_sp[[var]], 
                                          breaks = c(breaks, max(comuni_sp[[var]], na.rm = T)),
                                          labels = c("-50%", "-20%" , 
                                                     "-10%", "-5%", 
                                                     "-2%", "0%", 
                                                     "+2%", "+5%", "+10%",
                                                     "+20%", "+50%"))
}


comuni_df <- as.data.frame(comuni_sp)

assignMacroRegion <- function(x) {
  if (x %in% c('1','3',
               '7',"2")) {
    return('North-west')
  }
  if(x %in% c('5','6',
              "4",
              "8")) {
    return('North-east')
  }
  if (x %in% c("9", "10", "11", "12")) {
    return('Centre')
  }
  if (x %in% c("13", "16", "17", 
               "15", "18", "14")) {
    return('South')
  }
  if (x %in% c("19", "20")) {
    return("Islands")
  }
  # Fall back to NA so that sapply() always returns a character vector
  return(NA_character_)
}

comuni_df$macro_region <- sapply(as.character(comuni_df$COD_REG), assignMacroRegion)
comuni_df$macro_region <- factor(comuni_df$macro_region, 
                                 levels = c("North-west", "North-east",
                                            "Centre", "South", "Islands" ))

res_geo_details <- 
  dbGetQuery(conn, 'SELECT "PRO_COM_T",
             ST_Area(geom) AS area
             FROM ita_ge18.comuni2018_geo;')

res_demographics <-
  dbGetQuery(conn, "SELECT * FROM ita_ge18.comuni2018_census2011_demographics;")

comuni_df <- merge(comuni_df, res_geo_details, by = 'PRO_COM_T')
comuni_df <- merge(comuni_df, res_demographics, by = 'PRO_COM_T')

comuni_df$pop_density <- comuni_df$pop2011/comuni_df$area
comuni_df$unemployment <- comuni_df$popunemployed2011/comuni_df$popnotworkforce2011

comuni_df$housewive_perc <- comuni_df$pophousewife2011/comuni_df$popnotworkforce2011
comuni_df$over65_perc <- comuni_df$pop65over2011/comuni_df$pop2011
comuni_df$degree_perc <- comuni_df$popdegree2011/comuni_df$pop2011

comuni_df$foreignpop_perc <- comuni_df$foreignpop2011/comuni_df$pop2011
comuni_df$foreignpop_africa_perc <- comuni_df$foreignpopafrica2011/comuni_df$pop2011

comuni_df$turnout <- comuni_df$ppdt_nAssignedVotes_2018/comuni_df$ppdt_nRegisteredVoters_2018

comuni_df <- subset(comuni_df, sezioni_scrutinate_2018 > 0)

redditi2016 <- dbReadTable(conn, c("ita_ge18","comuni2018_redditi2016"))
comuni_df <- merge(comuni_df, redditi2016, by = 'PRO_COM_T', all.x = T)

```


The 2018 Italian general elections (*elections*, since both the Chamber of Deputies and the Senate were renewed) saw 

1) a significant increase in the number of votes for two parties, the Five Star Movement (M5S) and the League (formerly Northern League), 

and

2) an increase in the importance of geography as an explanatory dimension for the distribution of votes. 

The following two maps show where the M5S and the League have increased their electoral support from 2013 to 2018.

```{r map1, cache = T, fig.cap = "Vote difference: 2018-2013 (a few communes have not reported all the results, notably Rome)", eval = mapeval}
rdylbu.palette <- rev(brewer.pal(n = 11, name = "RdYlBu"))
spplot(comuni_sp, c('M5S_diff_brk','Lega_diff_brk'), 
       col.regions = rdylbu.palette, col = "transparent",
       names.attr = c('M5S', 'League'))
```
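The classes in the map legend come from binning the percentage-point differences with `cut()`. A minimal sketch with made-up values: the vector `diff_pp` is hypothetical, the breaks and labels mirror those used in the analysis code, and a fixed upper bound of 100 is assumed here in place of the observed maximum.

```r
# Class boundaries in percentage points; the final 100 is an assumed upper bound.
breaks <- c(-100, -50, -20, -10, -5, -2, 2, 5, 10, 20, 50, 100)
labels <- c("-50%", "-20%", "-10%", "-5%", "-2%", "0%",
            "+2%", "+5%", "+10%", "+20%", "+50%")

# Hypothetical 2018-2013 differences for six communes.
diff_pp <- c(-60, -3, 0, 4, 12, 55)

# Each value falls into a right-closed interval (lower, upper].
diff_class <- cut(diff_pp, breaks = breaks, labels = labels)
data.frame(diff_pp, diff_class)
```

Each label names the bound of its interval that is closest to zero, so "+2%" covers gains of more than 2 and up to 5 points, and "0%" covers changes within 2 points in either direction.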

The geographic pattern is quite simple. The M5S has increased its support in the South and maintained its votes in the North, while the League has significantly strengthened its support in the North but has also collected votes in the South, where it previously had virtually no support. The third and the fourth most voted parties, the Democratic Party (PD) and Berlusconi's Forza Italia (FI), have lost votes almost everywhere. If we map the results of the four parties side by side on the same scale, the PD and FI almost fade into the background.

```{r setup2, include = FALSE}
library(osmdata)
library(dplyr)
require(rgdal)
library(DBI)
library(RPostgreSQL)
library(rpostgis)
options(scipen=999)

conn <- dbConnect(
  drv = PostgreSQL(),
  dbname = "istat_sez2011",
  host = "localhost",
  port = "5432",
  user = "francesco",
  password = "")

camera_sp <- readOGR('/Users/francesco/public_git/scrape_2018_ita_ge_results/output_shapefiles/', layer = 'minint_camera')

res_camera <- dbGetQuery(conn, "SELECT votes, perc, minint_code FROM ita_ge18.minint_camera_party_votes WHERE partito = 'Lega';")


res_camera_other_candidate <- dbGetQuery(conn, "SELECT perc AS \"M5S\", minint_code FROM ita_ge18.minint_camera_candidate_votes WHERE coalizione = 'M5S';")

res_camera_other_party <- dbGetQuery(conn, "SELECT * FROM ita_ge18.minint_camera_party_votes WHERE 
                               partito = 'PD' OR  partito = 'Lega' OR  partito = 'FI';")
require(reshape2)
res_camera_other_party <- 
  dcast(data = res_camera_other_party, 
        formula = minint_code ~ partito, value.var = 'perc')
res_place_stats <- dbGetQuery(conn, "SELECT * FROM ita_ge18.minint_camera_place_stats;")

res_camera <- merge(res_camera, res_camera_other_candidate,
                    by = 'minint_code')
res_camera <- merge(res_camera, res_camera_other_party,
                    by = 'minint_code')
res_camera <- merge(res_camera, 
                    res_place_stats, 
                    by = 'minint_code')

camera_sp <- merge(camera_sp, res_camera, by.x = 'minint_cod', by.y = 'minint_code')

assignMacroRegion <- function(x) {
  if (x %in% c('Piemonte','Lombardia',
               'Liguria',"Valle d'Aosta/Vallée d'Aoste")) {
    return('North-west')
  }
  if(x %in% c('Veneto','Friuli-Venezia Giulia',
              "Trentino-Alto Adige/Südtirol",
              "Emilia-Romagna")) {
    return('North-east')
  }
  if (x %in% c("Toscana", "Umbria", "Marche", "Lazio")) {
    return('Centre')
  }
  if (x %in% c("Abruzzo", "Puglia", "Basilicata", 
               "Campania", "Calabria", "Molise")) {
    return('South')
  }
  if (x %in% c("Sicilia", "Sardegna")) {
    return("Islands")
  }
  # Fall back to NA so that sapply() always returns a character vector
  return(NA_character_)
}

res_geo_details <- 
  dbGetQuery(conn, "SELECT minint_geopol_cod AS minint_cod,
             ST_Area(ST_Transform(geom, 32632)) AS area
             FROM ita_ge18.subcomuni2018_camera_geo;")

res_demographics <-
  dbGetQuery(conn, "SELECT * FROM ita_ge18.subcomuni2018_census2011_demographics;")


```

```{r map2, fig.cap = "Votes in the 2018 General elections (Chamber)", eval = mapeval}
require(RColorBrewer)
ylgnbu.palette <- brewer.pal(n = 9, name = "YlOrRd")
camera_sp$PD_perc <- camera_sp$PD*100
camera_sp$M5S_perc <- camera_sp$M5S*100
camera_sp$Lega_perc <- camera_sp$Lega*100
camera_sp$FI_perc <- camera_sp$FI*100

spplot(camera_sp, c('M5S_perc','Lega_perc', 'PD_perc', 'FI_perc'), 
       col.regions = ylgnbu.palette, col = "transparent",
       names.attr = c('M5S', 'League', 'PD', 'FI'),
       cuts = 8)
```

Yet major metropolitan areas did not always follow the national trend. While Naples unambiguously voted M5S, in Turin, Milan and Rome the Democratic Party was the most voted party in the wealthiest districts. 

```{r map2turin, fig.cap = "Votes in the 2018 General elections (Chamber, Turin)", eval = mapeval}

# Torino
h_primary <- 
  opq(bbox = c(7.773856,44.96797,7.540799,45.147403)) %>%
  add_osm_feature(key = 'highway', value = 'primary') %>%
  osmdata_sp()

h_secondary <- 
  opq(bbox = c(7.794804,44.963269,7.550758,45.137329)) %>%
  add_osm_feature(key = 'highway', value = 'secondary') %>%
  osmdata_sp()

h_residential <- 
  opq(bbox = c(7.794804,44.963269,7.550758,45.137329)) %>%
  add_osm_feature(key = 'highway', value = 'residential') %>%
  osmdata_sp()

turin_hp = list("sp.lines", h_primary$osm_lines, alpha = 0.3)
turin_hs = list("sp.lines", h_secondary$osm_lines, alpha = 0.2)
turin_hr = list("sp.lines", h_residential$osm_lines, alpha = 0.1)

spplot(camera_sp, c('M5S_perc','Lega_perc', 'PD_perc', 'FI_perc'), 
       col.regions = ylgnbu.palette, col = "black",
       names.attr = c('M5S', 'League', 'PD', 'FI'),
       cuts = 8,
       xlim = c(7.540799, 7.773856),
       ylim = c(44.96797, 45.147403),
       sp.layout = list(turin_hp, turin_hs, turin_hr))
```

```{r map2milan, fig.cap = "Votes in the 2018 General elections (Chamber, Milan)", eval = mapeval}

h_primary <- 
  opq(bbox = c(9.290346,45.389779,9.065118,45.535689)) %>%
  add_osm_feature(key = 'highway', value = 'primary') %>%
  osmdata_sp()

h_secondary <- 
  opq(bbox = c(9.290346,45.389779,9.065118,45.535689)) %>%
  add_osm_feature(key = 'highway', value = 'secondary') %>%
  osmdata_sp()

h_residential <- 
  opq(bbox = c(9.290346,45.389779,9.065118,45.535689)) %>%
  add_osm_feature(key = 'highway', value = 'residential') %>%
  osmdata_sp()

milan_hp = list("sp.lines", h_primary$osm_lines, alpha = 0.3)
milan_hs = list("sp.lines", h_secondary$osm_lines, alpha = 0.2)
milan_hr = list("sp.lines", h_residential$osm_lines, alpha = 0.1)

spplot(camera_sp, c('M5S_perc','Lega_perc', 'PD_perc', 'FI_perc'), 
       col.regions = ylgnbu.palette, col = "black",
       names.attr = c('M5S', 'League', 'PD', 'FI'),
       cuts = 8,
       xlim = c(9.065118, 9.290346),
       ylim = c(45.389779, 45.535689),
       sp.layout = list(milan_hp, milan_hs, milan_hr))
```

```{r map2rome, fig.cap = "Votes in the 2018 General elections (Chamber, Rome)", cache = T, eval = mapeval}

h_primary <- 
  opq(bbox = c(12.687966,41.75491,12.2821,42.032751)) %>%
  add_osm_feature(key = 'highway', value = 'primary') %>%
  osmdata_sp()

h_secondary <- 
  opq(bbox = c(12.687966,41.75491,12.2821,42.032751)) %>%
  add_osm_feature(key = 'highway', value = 'secondary') %>%
  osmdata_sp()

h_residential <- 
  opq(bbox = c(12.687966,41.75491,12.2821,42.032751)) %>%
  add_osm_feature(key = 'highway', value = 'residential') %>%
  osmdata_sp()

rome_hp = list("sp.lines", h_primary$osm_lines, alpha = 0.3)
rome_hs = list("sp.lines", h_secondary$osm_lines, alpha = 0.2)
rome_hr = list("sp.lines", h_residential$osm_lines, alpha = 0.1)

spplot(camera_sp, c('M5S_perc','Lega_perc', 'PD_perc', 'FI_perc'), 
       col.regions = ylgnbu.palette, col = "black",
       names.attr = c('M5S', 'League', 'PD', 'FI'),
       cuts = 8,
       xlim = c(12.2821, 12.687966),
       ylim = c(41.75491, 42.032751),
       sp.layout = list(rome_hp, rome_hs, rome_hr))
```

```{r map2naples, fig.cap = "Votes in the 2018 General elections (Chamber, Naples)" , cache = T, eval = mapeval}

h_primary <- 
  opq(bbox = c(14.433709,40.71792,14.068722,40.982112)) %>%
  add_osm_feature(key = 'highway', value = 'primary') %>%
  osmdata_sp()

h_secondary <- 
  opq(bbox = c(14.433709,40.71792,14.068722,40.982112)) %>%
  add_osm_feature(key = 'highway', value = 'secondary') %>%
  osmdata_sp()

h_residential <- 
  opq(bbox = c(14.433709,40.71792,14.068722,40.982112)) %>%
  add_osm_feature(key = 'highway', value = 'residential') %>%
  osmdata_sp()

napoli_hp = list("sp.lines", h_primary$osm_lines, alpha = 0.3)
napoli_hs = list("sp.lines", h_secondary$osm_lines, alpha = 0.2)
napoli_hr = list("sp.lines", h_residential$osm_lines, alpha = 0.1)

spplot(camera_sp, c('M5S_perc','Lega_perc', 'PD_perc', 'FI_perc'), 
       col.regions = ylgnbu.palette, col = "black",
       names.attr = c('M5S', 'League', 'PD', 'FI'),
       cuts = 8,
       xlim = c(14.068722, 14.433709),
       ylim = c(40.71792, 40.982112),
       sp.layout = list(napoli_hp, napoli_hs, napoli_hr))
```

The density of the distribution of results at commune and sub-commune level in the macro regions indicates that, while the M5S dominates electorally in the South and in the two major islands, the League is the most popular party in the North.

```{r setup3}
require(ggplot2)
require(scales)

camera_df <- as.data.frame(camera_sp)
camera_df$macro_region <- sapply(camera_df$regione, assignMacroRegion)
camera_df$macro_region <- factor(camera_df$macro_region, 
                                 levels = c("North-west", "North-east",
                                            "Centre", "South", "Islands" ))

camera_df <- merge(camera_df, res_geo_details, by = 'minint_cod')
camera_df <- merge(camera_df, res_demographics, by.x = 'minint_cod', by.y = 'minint_geopol_cod')

camera_df$pop_density <- camera_df$pop2011/camera_df$area
camera_df$unemployment <- camera_df$popunemployed2011/camera_df$popnotworkforce2011

camera_df$housewive_perc <- camera_df$pophousewife2011/camera_df$popnotworkforce2011
camera_df$over65_perc <- camera_df$pop65over2011/camera_df$pop2011
camera_df$degree_perc <- camera_df$popdegree2011/camera_df$pop2011

camera_df$foreignpop_perc <- camera_df$foreignpop2011/camera_df$pop2011
camera_df$foreignpop_africa_perc <- camera_df$foreignpopafrica2011/camera_df$pop2011

camera_df <- subset(camera_df, sezioni_scrutinate > 0)

```   

```{r, fig.cap=c("Distribution of votes at commune or subcommune level")}
require(reshape2)
camera_df_melted <- melt(camera_df, id.vars = c("minint_cod","macro_region"),
                         measure.vars = c('PD','M5S','Lega','FI'))

camera_df_melted$macro_region <- factor(camera_df_melted$macro_region,
                                        levels = c('North-west','North-east','Centre',
                                                   'South','Islands','Country'))

ggplot(camera_df_melted, aes(value, fill = variable)) + 
  geom_density(alpha = 0.6) + 
  geom_density(data = transform(camera_df_melted, 
                                macro_region = factor("Country",
                                                      levels = c('North-west','North-east','Centre',
                                                   'South','Islands','Country'))), alpha = 0.6) + 
  theme_bw() + 
  scale_x_continuous(labels=percent) + 
  labs(x=NULL) + 
  facet_wrap(~macro_region) +
  guides(fill=guide_legend(title="Party"))
```

The territoriality of the results, especially along the North-South dimension, makes the analysis especially complicated. This is because the strong result of the League in the North and of the M5S in the South might simplistically suggest that immigration (which is much stronger in the North) explains the League's result in the North, and that unemployment and poverty (more widespread in the South) explain the M5S's result in the South. This reading is especially attractive, since immigration and the M5S proposal to introduce a guaranteed minimum income dominated the campaign. 

## Correlates of the M5S votes

What the electoral geography clearly describes is an antithetic and territorially driven relationship between the M5S and the League. The M5S vote is strongly and negatively correlated with the League vote. 

```{r fig.cap = 'M5S and Lega vote (%, provinces of Trento and Bolzano excluded)'}
ggplot(subset(camera_df, regione != "Trentino-Alto Adige/Südtirol"), aes(M5S, Lega)) + 
  geom_point(aes(color = macro_region), alpha = .85) + 
  geom_smooth(se=FALSE) +
    theme_bw() + 
  scale_x_continuous(labels=percent) + 
  scale_y_continuous(labels=percent) 
```

When we exclude the provinces of Trento and Bolzano (which, because of their autonomous status, have a different party tradition), the negative correlation between the League and the M5S vote reaches 80%. No other pair of parties is as strongly correlated:

```{r, fig.cap = 'Correlation of the vote for the four main parties'}
require(GGally)
ggpairs(subset(camera_df, regione != "Trentino-Alto Adige/Südtirol", select = c("M5S","Lega","PD","FI"))*100) +
  theme_bw()
```
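The figure reported above is a Pearson correlation computed across geographies. A minimal sketch on simulated vote shares, where the data frame `votes` and the relationship between its two columns are invented purely for illustration:

```r
set.seed(1)
n <- 200

# Simulated shares: the League share falls as the M5S share rises.
m5s  <- runif(n, 0.1, 0.6)
lega <- pmax(0, 0.45 - 0.7 * m5s + rnorm(n, sd = 0.04))
votes <- data.frame(M5S = m5s, Lega = lega)

# Pearson correlation and its 95% confidence interval.
r  <- cor(votes$M5S, votes$Lega)
ci <- cor.test(votes$M5S, votes$Lega)$conf.int
round(c(r = r, lower = ci[1], upper = ci[2]), 2)
```

A coefficient close to -1 means that geographies with a high share for one party tend to have a low share for the other; it says nothing about why.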


In a **first model**, I use data from the lowest geographic level available --- either the commune level or, for large metropolitan areas, the sub-commune level. I then test the association between the results in each of the `r nrow(camera_df)` geographies and the census demographics from 2011.  

```{r}
camera_df$partwf_perc <- camera_df$popnotworkforce2011/(camera_df$pop2011-camera_df$pop65over2011)
base_formula <- "log(pop_density)+unemployment+housewive_perc+partwf_perc+over65_perc+degree_perc+foreignpop_perc+foreignpop_africa_perc"
m5s_camera_mod_national <- lm(data = camera_df, formula = paste0("M5S~",base_formula))
m5s_camera_mod_centrenorth <- 
  lm(data = subset(camera_df, macro_region %in% c('North-west','North-east','Centre')), formula = paste0("M5S~",base_formula))
m5s_camera_mod_southislands <- 
  lm(data = subset(camera_df, macro_region %in% c('South','Islands')), formula = paste0("M5S~",base_formula))
m5s_camera_mod_south <- 
  lm(data = subset(camera_df, macro_region %in% c('South')), formula = paste0("M5S~",base_formula))
m5s_camera_mod_islands <- 
  lm(data = subset(camera_df, macro_region %in% c('Islands')), formula = paste0("M5S~",base_formula))
```

```{r, results = 'asis'}
stargazer(m5s_camera_mod_national, 
         m5s_camera_mod_centrenorth, 
         m5s_camera_mod_southislands,
         m5s_camera_mod_south,
         m5s_camera_mod_islands,
         type = 'html', 
         omit.stat = c("f","ser"),
         dep.var.labels = "M5S %",
         covariate.labels = c("Pop. density (log)", "Unemployment %", "Housewives %",
                              "Not workforce below 65 %",
                              "Over 65%", "With degree %", "Foreign pop. %", 
                              "Foreign African pop. %"),
         column.labels = c("Country", "Centre-north", "South and islands", 
                           "South", "Islands"))
```

Results indicate that while unemployment is strongly associated with M5S support at the national level, it is not significant when we only consider the South (Sicily and Sardinia included). When we exclude the two main islands, unemployment is again significant, but the association between M5S support and unemployment is weaker than the association between M5S support and every other demographic variable, with the exception of the percentage of the population with a university degree. And if we only consider the two main islands, unemployment is actually negatively correlated with M5S support.
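The comparison across columns amounts to fitting the same formula on different regional subsets and reading off one coefficient. A minimal sketch of that pattern on simulated data: the data frame `toy`, its three regions and the slopes used to generate it are assumptions made for illustration, not estimates from the election data.

```r
set.seed(42)
n <- 300
toy <- data.frame(
  macro_region = rep(c("North", "South", "Islands"), each = n / 3),
  unemployment = runif(n, 0.05, 0.5),
  degree_perc  = runif(n, 0.05, 0.2),
  stringsAsFactors = FALSE
)

# Invented data-generating process: the unemployment slope differs by region.
slope <- unname(c(North = 0.5, South = 0, Islands = -0.3)[toy$macro_region])
toy$M5S <- 0.3 + slope * toy$unemployment + rnorm(n, sd = 0.03)

# Same specification, one fit per region.
f <- M5S ~ unemployment + degree_perc
fits <- lapply(split(toy, toy$macro_region), function(d) lm(f, data = d))

# Unemployment coefficient and p-value for each regional model.
unemp <- t(sapply(fits, function(m) {
  coef(summary(m))["unemployment", c("Estimate", "Pr(>|t|)")]
}))
round(unemp, 3)
```

A pooled model fitted on `toy` as a whole would average over these opposite slopes, which is why the regional subsets can tell a different story from the national column.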

```{r, fig.cap = 'Unemployment and M5S vote in different macro regions'}
ggplot(camera_df, aes(M5S, unemployment)) + 
  geom_point(alpha = 0.6, size = .4) + 
  geom_point(data = transform(camera_df, 
                                macro_region = factor("Country",
                                                      levels = c('North-west','North-east','Centre',
                                                   'South','Islands','Country'))), alpha = 0.6,
             size = .4) + 
  geom_smooth(se = FALSE, method = 'loess') + 
  geom_smooth(data = transform(camera_df, 
                                macro_region = factor("Country",
                                                      levels = c('North-west','North-east','Centre',
                                                   'South','Islands','Country'))),
             se = FALSE, method = 'loess') + 
  theme_bw() + 
  scale_x_continuous(labels=percent) + 
  scale_y_continuous(labels=percent) + 
  labs(x=NULL) + 
  facet_wrap(~macro_region)
```

The unemployment rate is significantly higher in the South; almost all the geographies with an unemployment rate above 40% are indeed in the South. One could hypothesise a non-linear relation between M5S support and unemployment: for example, strong below 40% and null above. Still, this is not fully supported by a locally weighted regression line (LOESS) fitted to the data for the different macro regions (see the figure above). And it does not explain the behaviour in Sicily and Sardinia (with a median support for the M5S of `r round(median(subset(camera_df, macro_region %in% c('Islands'))$M5S)*100,0)`% against a national median of `r round(median(camera_df$M5S)*100,0)`%), where unemployment is actually associated with *less* support for the M5S.
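One way to probe the threshold hypothesis more directly than with LOESS would be a piecewise-linear (broken-stick) regression with a knot at 40%. The sketch below is not part of the analysis in this article: it runs on simulated data, and the knot, the variable names and the effect sizes are all assumptions made for illustration.

```r
set.seed(7)
n <- 500
knot <- 0.4
unemployment <- runif(n, 0, 0.8)

# Simulated support: rises with unemployment up to the knot, flat afterwards.
m5s <- 0.2 + 0.5 * pmin(unemployment, knot) + rnorm(n, sd = 0.03)

# Two slope terms: one varies only below the knot, the other only above it.
below <- pmin(unemployment, knot)
above <- pmax(unemployment - knot, 0)

fit <- lm(m5s ~ below + above)
round(coef(fit), 2)
```

A slope for `above` that is indistinguishable from zero would be consistent with an association that vanishes past the threshold; a clearly non-zero slope would speak against it.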

In a **second model**, I use the results of the 2018 elections at the commune level together with the results of the 2013 elections. This allows me to introduce a variable capturing the *relative* change in support for the Movement between the two elections. 
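In the analysis code, this change is measured as the difference between the 2018 and the 2013 share of registered voters who voted M5S, expressed in percentage points. A worked example with invented counts, where `registered_2013` and `registered_2018` are shortened stand-ins for the registered-voter columns:

```r
# Hypothetical counts for three communes.
toy <- data.frame(
  M5S_2013        = c(200, 1500, 90),
  registered_2013 = c(1000, 6000, 400),
  M5S_2018        = c(350, 1500, 80),
  registered_2018 = c(1000, 5000, 400)
)

# Difference between the two shares of registered voters, in percentage points.
toy$M5S_diff <- with(
  toy,
  (M5S_2018 / registered_2018 - M5S_2013 / registered_2013) * 100
)
toy$M5S_diff
```

The second commune shows why shares matter: the number of M5S votes is unchanged, but because the electorate shrank the measure still records a gain.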

```{r}

comuni_df$reddito_percap <- (comuni_df$redditoAmmontare / comuni_df$pop2011) / 1000
comuni_df$partwf_perc <- comuni_df$popnotworkforce2011/(comuni_df$pop2011-comuni_df$pop65over2011)


base_formula <- paste0(base_formula, "+reddito_percap")
m5s_camera_moddiff_national <- 
  lm(data = comuni_df, formula = paste0("M5S_diff~",base_formula))

m5s_camera_moddiff_centrenorth <- 
  lm(data = subset(comuni_df, macro_region %in% c('North-west','North-east','Centre')), 
     formula = paste0("M5S_diff~",base_formula))

m5s_camera_moddiff_southislands <- 
  lm(data = subset(comuni_df, macro_region %in% c('South','Islands')), 
     formula = paste0("M5S_diff~",base_formula))

m5s_camera_moddiff_south <- 
  lm(data = subset(comuni_df, macro_region %in% c('South')), 
     formula = paste0("M5S_diff~",base_formula))

m5s_camera_moddiff_islands <- 
  lm(data = subset(comuni_df, macro_region %in% c('Islands')), 
     formula = paste0("M5S_diff~",base_formula))

```

```{r, results = 'asis'}

stargazer(m5s_camera_moddiff_national, 
         m5s_camera_moddiff_centrenorth, 
         m5s_camera_moddiff_southislands,
         m5s_camera_moddiff_south,
         m5s_camera_moddiff_islands,
         type = 'html', 
         omit.stat = c("f","ser"),
         dep.var.labels = "M5S (% difference 2018-2013)",
         covariate.labels = c("Pop. density (log)", "Unemployment %", 
                              "Housewives %", "Not workforce below 65 %", 
                              "Over 65%", "With degree %", "Foreign pop. %", 
                              "Foreign African pop. %","Income ('000 € per capita)"),
         column.labels = c("Country", "Centre-north", "South and islands", 
                           "South", "Islands"))
```

When the South (this time both including and excluding the two main islands) is considered in isolation from the rest of the country, unemployment is unambiguously not a positive correlate of the change in electoral support: as a matter of fact, it is significantly but *negatively correlated*. Only a *lower* per capita income seems to be an important determinant of the support for the Movement throughout the country. 

**In conclusion**, unemployment is not a satisfactory explanation for the unprecedented result of the M5S in the South, whether or not we control for the per capita income of the population, and whether we consider the results of the 2018 general election in terms of vote share or in terms of the change from the previous general elections. A "standalone" economic answer as an explanation of the success of the M5S is far from convincing. I am not disputing that economic anxiety plays a role in the electorate of the M5S. But unemployment is not necessarily part of the mix. According to the Itanes 2013 electoral survey, M5S voters are more likely to be found in the workforce than non-M5S voters, although, crucially, they are more likely to experience economic hardship (see my *[Road to Rome: The organisational and political success of the M5S](https://poppoliticsaus.wordpress.com/2016/07/08/road-to-rome-the-organisational-and-political-success-of-the-m5s/)*). 

## Onsite participation: Meetup events

```{r, include=FALSE}
library(DBI)
library(RPostgreSQL)
library(rpostgis)
options(scipen=999)

conn <- dbConnect(
  drv = PostgreSQL(),
  dbname = "istat_sez2011",
  host = "localhost",
  port = "5432",
  user = "francesco",
  password = "")


meetup_event <- dbReadTable(conn, c('ita_ge18','meetup2018_event'))
meetup_event_all <- meetup_event
meetup2018_comuni2018 <- dbReadTable(conn, c('ita_ge18','meetup2018_comuni2018'))
meetup_venue <- dbGetQuery(conn, "SELECT venue_id, lon, lat FROM ita_ge18.meetup2018_venue")

meetup_event <- merge(meetup_event, meetup2018_comuni2018, by.x = 'venue', by.y='venue_id', all = FALSE)

meetup_event$time <- as.Date(meetup_event$time)
meetup_event <- subset(meetup_event, time > as.Date('2018-03-04')-90)
meetup_event <- merge(meetup_event, meetup_venue, by.x = 'venue', by.y = 'venue_id')

meetup_event <- SpatialPointsDataFrame(meetup_event[c('lon','lat')], data = meetup_event)

library(raster)  
library(ggthemes)

ita_sp <- 
  readOGR('/Users/francesco/Desktop/GIS_Data/Administrative units/Italy/ITA_adm/',
        'ITA_adm0_simp')
ita_rast <- raster(ita_sp)
res(ita_rast) <- 0.2
ita_rast <- rasterize(ita_sp, ita_rast)
quads <- as(ita_rast, 'SpatialPolygons')


meetup_event_r <- rasterize(coordinates(meetup_event), ita_rast, fun='count', background=0)
meetup_event_spdf <- as(meetup_event_r, "SpatialPixelsDataFrame")
meetup_event_df <- as.data.frame(meetup_event_spdf)
colnames(meetup_event_df) <- c("value", "x", "y")

```


Since 2005, the M5S has used [Meetup.com](http://www.meetup.com) to allow activists and sympathisers to meet in Italy and around the world. By querying the [API](https://www.meetup.com/meetup_api/) of Meetup.com for all groups linked to Beppe Grillo or the M5S, I am able to map the frequency of events organised by the groups over time. 

```{r meetup-ts, fig.cap = paste0("Frequency of M5S meetup events (n=", nrow(meetup_event_all), ") worldwide")}

require(dplyr)
meetup_event_day <-
  meetup_event_all %>%
  group_by(date = as.Date(time)) %>%
  summarize(n = n())

require(zoo)
meetup_event_day$ma30 <- rollmean(meetup_event_day$n, fill = NA, k = 30)

date_vday_07 <- as.Date('2007-09-08')
date_admin_elections_12 <- as.Date('2012-05-06')
date_sicily_election_12 <- as.Date('2012-10-28')
date_gener_election_13 <- as.Date('2013-02-24')
date_europ_election_14 <- as.Date('2014-05-25') # Italy voted on 25 May (EU-wide polling ran 22-25 May)
date_referendum_16 <- as.Date('2016-12-04')
date_gener_election_18 <- as.Date('2018-03-04')

label_df <- data.frame(y = -10,
                       x = c(date_vday_07, date_admin_elections_12, date_sicily_election_12,
                             date_gener_election_13, date_europ_election_14,
                             date_referendum_16, date_gener_election_18),
                       label = 1:7)

ggplot(meetup_event_day, aes(x=date)) +
  geom_line(aes(y=n), alpha = 0.2) + 
  geom_line(aes(y=ma30)) + 
  geom_label(data=label_df, aes(x,y,label=label)) +
  theme_bw() +
  labs(x=NULL,y="Number of meetup events", caption = '1: VDay, 2: Administrative election, 3: Sicilian regional election, 4: General elections,\n5: European Parliament election, 6: Constitutional referendum, 7: General elections')

```

Elections (but also the [V-Day](https://en.wikipedia.org/wiki/Five_Star_Movement#V-Days) of 2007) are clearly associated with a spike in mobilisation. Assuming that the number of events organised throughout the country is geographically correlated with the support for the Movement, I calculate the territorial density of events organised in the 90 days preceding the election and I test its association with local electoral support.
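The two steps behind this density, keeping only the events in the 90-day window and counting them per grid cell, can be sketched in base R. This is a simplified stand-in for the raster-based computation used in the analysis: the `events` data frame is invented, cells are aligned to multiples of 0.2 degrees instead of a grid built from the outline of Italy, and the upper bound on the date is an added assumption.

```r
election_day <- as.Date("2018-03-04")

# Hypothetical events: date and venue coordinates.
events <- data.frame(
  time = as.Date(c("2017-11-01", "2017-12-20", "2018-01-15",
                   "2018-02-28", "2018-03-01")),
  lon  = c(12.49, 12.51, 12.55, 9.19, 9.05),
  lat  = c(41.90, 41.88, 41.95, 45.46, 45.41)
)

# Step 1: keep events held in the 90 days before the election.
recent <- subset(events, time > election_day - 90 & time <= election_day)

# Step 2: assign each event to a 0.2-degree cell and count events per cell.
res <- 0.2
recent$cell_x <- floor(recent$lon / res)
recent$cell_y <- floor(recent$lat / res)
recent$n <- 1
counts <- aggregate(n ~ cell_x + cell_y, data = recent, FUN = sum)
counts
```

With these invented points, the first event falls outside the window and the remaining four land in two cells with two events each.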

```{r plot-density, fig.cap = paste0('Distribution of meetup events (n=', nrow(meetup_event),') in the 90 days before the election (first panel) and their density computed for each cell of a grid covering the entire national territory (second panel)')}

require(viridis)

ggplot() +
  geom_polygon(data = ita_sp, aes(long,lat,group=group), fill=NA, colour='black') + 
  geom_polygon(data = quads, aes(long,lat,group=group), fill=NA, colour='black', size = .1) +
  geom_point(data = as.data.frame(meetup_event), aes(lon,lat), alpha = 0.5, size = .6) +
  coord_equal() +
  theme_map()

ggplot() +
  geom_tile(data=meetup_event_df, aes(x=x, y=y, fill=sqrt(value)), alpha=0.8) + 
  scale_fill_viridis() +
  geom_polygon(data = ita_sp, aes(long,lat,group=group), fill=NA, colour='black') + 
  coord_equal() +
  theme_map() +
  guides(fill=guide_legend(title="Meetup event density √")) +
  theme(legend.position="bottom") +
  theme(legend.key.width=unit(2, "cm"))
```

```{r spatial-analysis}

v <- extract(meetup_event_r, comuni_sp, fun = mean)
comuni_meetup_density <- data.frame(v)
comuni_meetup_density$v[is.na(comuni_meetup_density$v)] <- 0

comuni_df$meetup_density <- comuni_meetup_density$v[match(comuni_df$PRO_COM_T, comuni_sp$PRO_COM_T)]


comuni_df$M5S_2018_perc <-
  comuni_df$M5S_2018 / comuni_df$ppdt_nAssignedVotes_2018
  
m5s_camera_mod <- 
  lm(data = comuni_df, 
     formula = paste0("M5S_2018_perc~",base_formula,"+sqrt(meetup_density)"))

m5s_camera_mod_centrenorth <- 
  lm(data = subset(comuni_df, macro_region %in% c("North-west","North-east","Centre")), 
     formula = paste0("M5S_2018_perc~",base_formula,"+sqrt(meetup_density)"))

m5s_camera_mod_southislands <- 
  lm(data = subset(comuni_df, macro_region %in% c('South','Islands')), 
     formula = paste0("M5S_2018_perc~",base_formula,"+sqrt(meetup_density)"))

m5s_camera_moddiff <- 
  lm(data = comuni_df, 
     formula = paste0("M5S_diff~",base_formula,"+sqrt(meetup_density)"))

m5s_camera_moddiff_centrenorth <- 
  lm(data = subset(comuni_df, macro_region %in% c("North-west","North-east","Centre")), 
     formula = paste0("M5S_diff~",base_formula,"+sqrt(meetup_density)"))

m5s_camera_moddiff_southislands <- 
  lm(data = subset(comuni_df, macro_region %in% c('South','Islands')), 
     formula = paste0("M5S_diff~",base_formula,"+sqrt(meetup_density)"))

library(QuantPsyc)
coef_lmbeta <- lm.beta(m5s_camera_moddiff_southislands)

```

The strength and significance of the territorial density of events are estimated together with those of the variables from the previous models. The coefficients below have been standardised, so that the dependent variable and all independent variables have a variance of 1. 
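
To make the standardisation concrete, here is a minimal sketch based on the usual definition of a standardised coefficient: the raw slope multiplied by the ratio between the standard deviation of the predictor and that of the outcome. It uses R's built-in `mtcars` data purely for illustration (it has nothing to do with the election data), and the object names are hypothetical.

```r
# Built-in mtcars data, used purely for illustration
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Standardised coefficient: raw slope rescaled by sd(predictor) / sd(outcome)
b <- coef(fit)[c("wt", "hp")]
beta_manual <- b * c(sd(mtcars$wt), sd(mtcars$hp)) / sd(mtcars$mpg)

# Equivalent route: refit the model on z-scored variables
z <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp")]))
beta_scaled <- coef(lm(mpg ~ wt + hp, data = z))[c("wt", "hp")]

# The two routes should agree up to numerical tolerance
all.equal(beta_manual, beta_scaled)
```

Because every variable is expressed in standard deviations, the size of the coefficients can be compared across predictors measured in different units, which is what allows the event density to be ranked against the demographic and economic variables.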

```{r results='asis'}

library(lm.beta)

m5s_camera_mod_beta <- lm.beta(m5s_camera_mod)
m5s_camera_mod_centrenorth_beta <- lm.beta(m5s_camera_mod_centrenorth)
m5s_camera_mod_southislands_beta <- lm.beta(m5s_camera_mod_southislands)
m5s_camera_moddiff_beta <- lm.beta(m5s_camera_moddiff)
m5s_camera_moddiff_centrenorth_beta <- lm.beta(m5s_camera_moddiff_centrenorth)
m5s_camera_moddiff_southislands_beta <- lm.beta(m5s_camera_moddiff_southislands)

stargazer(m5s_camera_mod, 
          m5s_camera_mod_centrenorth,
          m5s_camera_mod_southislands,
          m5s_camera_moddiff,
          m5s_camera_moddiff_centrenorth,
          m5s_camera_moddiff_southislands,
         type = 'html', 
         omit.stat = c("f","ser"),
         dep.var.labels = c("M5S (% 2018)", "M5S (% difference 2018-2013)"),
         covariate.labels = c("Pop. density (log)", "Unemployment %", 
                              "Housewives %", "Not workforce below 65 %", 
                              "Over 65%", "With degree %", "Foreign pop. %", 
                              "Foreign African pop. %","Income ('000 € per capita)", 
                              "Meetup event density √"),
         column.labels = c("Country", "Centre-North", "South and islands", 
                           "Country", "Centre-North", "South and islands")
         ,coef = list(m5s_camera_mod_beta$standardized.coefficients,
                     m5s_camera_mod_centrenorth_beta$standardized.coefficients,
                     m5s_camera_mod_southislands_beta$standardized.coefficients,
                     m5s_camera_moddiff_beta$standardized.coefficients,
                     m5s_camera_moddiff_centrenorth_beta$standardized.coefficients,
                     m5s_camera_moddiff_southislands_beta$standardized.coefficients)
         ,p= list(summary(m5s_camera_mod_beta)$coefficients[,5],
                  summary(m5s_camera_mod_centrenorth_beta)$coefficients[,5],
                  summary(m5s_camera_mod_southislands_beta)$coefficients[,5],
                  summary(m5s_camera_moddiff_beta)$coefficients[,5],
                  summary(m5s_camera_moddiff_centrenorth_beta)$coefficients[,5],
                  summary(m5s_camera_moddiff_southislands_beta)$coefficients[,5])
         )

```

1. **Unemployment** is associated with support for the M5S only in the Centre-North. In the South, unemployment is significant only when we measure its association with the *increase* in votes, and there the correlation is negative. 

2. Lower **income** is consistently a predictor of support for the Movement. 

3. The **foreign population**, but **not the foreign African population**, is overall smaller in areas where the Movement is strong.

4. Generally, the M5S is stronger in areas that are **inhabited more densely and by younger people**. 

5. Finally, the local reach of the organisation, measured by the density of Meetup events, is **strongly associated with support in the South** (in fact, it is the strongest positive predictor) but **not in the Centre-North**, if we look at absolute support. 

In **conclusion**, the economic environment is a good predictor of the performance of the Movement, but I argue that what drives votes in the South is different from what drives votes in the rest of the country. 

1. In the Centre-North, we can picture the average M5S voter as younger and more educated *but* more likely to be unemployed. This voter is also more likely to live in economically less dynamic areas (urban but poorer, with less immigration and lower participation in the workforce).

2. In the South (islands included), the picture is different. The average M5S voter lives in areas that are younger (although education does not play a role), urban and poorer but, crucially, with *lower* unemployment and (if we look at absolute support) with higher participation in the workforce. This can probably be explained by the high levels of underemployment and poorly paid employment that characterise the South. Political participation has been largely underestimated as a driver of the Southern vote: it is a very strong predictor of both absolute support and the increase in support. 

## It's not (only) the economy, stupid


Participation matters in creating *new* support. The economic narrative of the M5S has of course played an important part in motivating voters. But the explanatory trajectory for the Southern vote

**unemployment** → **political offer of guaranteed minimum income** → **electoral support** 

is simply not supported by the data. Political decisions such as voting are always complex and naturally open to multiple readings. Considerations about personal and community economic well-being are important. But so is political ideology. The ideology of the M5S is thin and, because of this, mostly overlooked. The Movement's political platform is non-traditional in the sense that it is not really about economic justice, national sovereignty, or environmental preservation. Indeed, on these issues a whole range of opinions is present in the Movement. The M5S is neither progressive nor conservative. The ideology of the Movement is defined by the reconfiguration of mass participation: from representative, mediated by party and parliament, to direct and unmediated. The analysis of the density of onsite events organised by groups linked to the Movement indeed supports the importance of the experience of individual participation in a macro-region (the South) plagued by political dysfunction. Participation need not be directly experienced (the large majority of M5S voters clearly never attended a meeting): the mere presence of a locally grounded network of participating and self-organised individuals is probably enough to give credibility to the ideological message of political disruption through a direct reappropriation of political power.

*A replication package for the analysis presented in the article is available [here](https://github.com/fraba/scrape_2018_ita_ge_results/tree/master/output_html).*

```{r}
save.image(file = '/Users/francesco/public_git/scrape_2018_ita_ge_results/output_html/blog_post_image.RData')
```

